The objective of this module is to discuss the function and application of animal movement models, specifically, Contiuous-Time Movement Models (ctmm), and explore measures of overlap.
You are at the bar, and you just got you and your friends a third round of drinks. You turn around and the dance floor is packed like sardines, and your friends are at the very front next to the DJ booth. How do you get from where you are to where you’re going? Maybe you take a step to the left, two steps forward, another back and to the right. Each step you take is independent from the last, as the dance floor is moving, grooving and no one is staying still. This goes on (and on and on) until you have made your way back to your friends.
Some movement ecologists would consider this navigation an example of a random walk model. Random walk models assume that in a time period, an individual will take random and independent steps that are identically distributed in size and away from their previous position (Nau 2014). However, these steps do depend on the location of your previous step, with each step the starting point of your next.. that is - unless you’ve figured out teleportation.
Okay. Now, you’ve made it home from your run and you want to see the data on your fitness tracker app. But, OH NO! You must have accidentally turned it on last night when you were leaving the bar. It tracked your uber ride home, when you got out of bed to get a glass of water, when you laid on the bathroom floor for a while, and then the run. That is a lot of movement data!
This, like any GPS monitoring system, is an example of continuous-time stochastic process modeling. That is, this is a model that accounts for finely sampled and continuous modern data collection (Calabrese et al. 2016).
CRW was so focused on the sampling schedule that it neglected continuous autocorrelated movement. CTSP models, on the other hand, are able to separate these factors while taking both into consideration. CTSP models are able to accommodate a range of data characteristics including irregular sampling schedules and varying levels of autocorrelation. However, CTSP modeling can be statistically complex and lack of comprehensive software to process this kind of data made it relatively inaccessible.
Fortunately, the continuous-time movement modeling (ctmm) package for R was created. ctmm allows a broad array of movement related data analyses including diagnostic data visualization and model fitting. These parameters can be used to analyze various metrics related to animal movement. In this module, we will examine evidence of range residency, calculate home range size with confidence intervals, and visualize the overlap between the home ranges of two coyotes.
‘Everything is related to everything else, but near things are more related than distant things’ - Waldo R. Tobler
As we have discussed in class previously, statistical analyses often rest on the assumption that data are independent and identically/normally distributed. However, we have also explored how such discrete independence is often not the case (like with our introduction to regression modeling). Investigating autocorrelation, the similarity of near observations, can reveal patterns and relationships that can better inform the selection of appropriate statistical analyses.
Spatial data have a strong tendency to be dependent, meaning near values are more/less similar than expected compared to randomly associated pairs of observations. When dealing with spatial data, like the animal movement data we are exploring here, spatial autocorrelation is a critical concept that drives model fitting and analysis.
We will continue to explore this concept in greater detail as we dive deeper into the components of ctmm modeling which help ensure our spatial models are statistically meaningful.
ctmm & MovebankRecently there has been a rapid development and advancement of the technology associated with movement ecology (Calabrese et al. 2016). The discrete-time correlated random walk, or CRW, had been the standard tool utilized for the statistical analysis of this data. Yet, this methodology is not precise and often confounds the movement process thus producing different results even when the exact same path is being sampled. This is because CRW reflects the sampling schedule more than it does the actual process of movement (Calabrese et al. 2016). The package ctmm solved this problem via the use of a continuous-time stochastic process (Calabrese et al. 2016).
Movebank is a free online tool that allows individuals to store and archive animal movement data to be used by other scientists and lay people alike (Mrozewski 2018). This website is an especially useful tool for cross species comparisons, as well as comparisons over time, and climate/ecological change (Kranstauber et al. 2011). You can explore the website and various publicly available datasets for yourself here!
Understanding animal movement is crucial when studying population ecology, disease spread, the flow of genetic material, and conservation among other things (Calabrese et al., 2016). In so far as conservation goes, this kind of data allows us to understand how animals are using their environments, and what can be considered their home ranges. With this information, we can designate certain areas as protected, create corridors and encourage reforestation of degraded areas.
As shown in the background literature, advancements in movement modelling have allowed for more accurate home range estimates that account for both the time dimension and autocorrelation in global position system (GPS) data. Below is an example of the implementation of the R package {ctmm} (Contnuous-Time Movement Modeling) to calculate the autocorrelated kernel density estimates (AKDE) of coyotes (Canis latrans) from publicly available movement data shared on Movebank (Mahoney & Young, 2017). These data were collected via Lotek GPS collars (Model GPS3300S; Lotek, Newmarket, ON, Canada) on 8 coyotes in Idaho between 2004-2015. If you would like to also play around with this massive data set, take a look at the data repository on Movebank.
We will begin by loading in a subset of the Mahoney & Young 2017 data. For the purpose of this module, we are only using data from ~ 1 month of the study (January 2005). The GPS collars used in the study recorded data every five minutes for over ten years and no one likes code that takes days to run :’)
Let’s first load our packages and then use {curl} to save the Movebank GPS data into the new object bigdata
library(ctmm)
library(curl)
f <- curl("https://raw.githubusercontent.com/vzdanowicz/AN588_ctmm/main/thisisit.csv")
bigdata <- read.csv(f, header = TRUE, sep = ",", stringsAsFactors = FALSE)
The {ctmm} package was designed explicitly to go hand in hand with Movebank. For any ctmm function to operate, data must be in Movebank format and coerced into a telemetry object.
Telemetry objects in {ctmm} contain information about the movement being analyzed including GPS location and sampling schedule. Objects from the {move} package in R can also be utilized in ctmm by using the as.telemetry function. Once in the proper format, movement data can be manipulated and visualized for data characterization and analysis.
coyotes <- as.telemetry(bigdata)
# Changing the names of the coyote IDs to make coding easier!
names(coyotes)<-c('coyA','coyB','coyC','coyD','coyE','coyF','coyG','coyH')
As with any new data - it is important to explore!
Plotting and visualizing raw movement tracks is a great way to assess your data as a whole. In initial visualization you can detect obvious migratory behaviors, directional patterns, or perhaps even erroneous data points that require attention. This step can also aid in data selection for modeling.
Let’s visualize…
par(mfrow = c(1, 1))
plot(coyotes, col = rainbow(length(coyotes)))
title("All Coyotes")
From this plot, we can see pretty distinct ranges for the 8 individuals and there do not appear to be any unnatural patterns or random points that could skew future analysis.
Though you can use ctmm models to analyze cumulative data from an entire population, for the sake of time, we are going to build our model using the data for a single coyote, coyG.
Let’s begin by pulling out all of the movement data for this individual to create and then plot the object coy1
coy1 <- coyotes$coyG
plot(coy1)
We can see from the fairly tight clustering of GPS points, that this coyote appears to be range resident with no obvious migrations. The next step of our workflow is to calculate and plot variograms from the GPS data for this individual.
Variograms are plots of the semi-variance between positions with varying time-lags, providing an unbiased way to visualize the autocorrelation of our data. The structure of the variogram provides critical information on the movement of an animal, as well as what method of modeling will fit the data best.
Arguably, the most crucial element of a variogram is if the semi-variance reaches an asymptote. This indicates that an animal is range-resident, which allows our range calculation to be meaningful.
The code below employs the function variogram() to calculate and then plot the variogram of telemetry object, coy1.
vg.coy1<-variogram(coy1)
#plotting long/short lag variograms
par(mfrow = c(1, 2))
plot(vg.coy1)
title("Long Lag")
plot(vg.coy1,fraction=0.005)
title("Short Lag")
The asymptote of the variogram for coy1 at longer (day) lags supports our visual analysis that coy1 is range-resident, while the slight upward curve of the short (minutes) lags show evidence of directional persistence in the data.
Variograms are used to determine the initial parameter guesstimates which are then utilized in maximum likelihood estimation, which we will describe in more detail later on.
CHALLENGE 1
Follow the steps outlined above to calculate and plot the variograms for a different individual, coyH.
coy2 <- coyotes$coyH
plot(coy2)
vg.coy2<-variogram(coy2)
par(mfrow = c(1, 2))
plot(vg.coy2)
plot(vg.coy2,fraction=0.005)
We can then use either variogram.fit() or ctmm.guess() to generate our initial parameter guesses/estimates. These variables will later be passed to ctmm.select or ctmm.fit to better assess the model.
If run in your console, variogram.fit() allows you to visually assess and adjust parameters via the manipulate gear in top left corner of the plot pane.
NOTE: Though you can save these parameters for later use in model fitting by selecting ‘save to GUESS’ from manipulate, the code below already includes the creation of object
GUESSwithout needing to save thevariogram.fit()parameters from the plot pane.
variogram.fit(vg.coy1)
GUESS<-variogram.fit(vg.coy1)
We can also use ctmm.guess() and the variogram for coy1, vg.coy1, to estimate initial model parameters.
coy1_GUESS <- ctmm.guess(coy1,variogram = vg.coy1,interactive=FALSE)
These parameters, GUESS and coy1_GUESS, are critical in our ability to assess the most suitable model type for the data at hand. We will next explore how ctmm.select and ctmm.fit allow us to test and visualize potential model types.
In order to use autocorrelated kernel density estimation (AKDE) to perform accurate home range analyses, we have to ensure that our data indicates range residency and then choose the best model to represent the data. The first step as discussed above is to use a variogram to visualize whether movement is confined to a given space or home range. Then, a movement model is chosen.
The independent identically distributed (IID) process assumes each data point is independent and random. However, animal movement is automatically correlated since one’s location in space is dependent on continuous movement (i.e. coyotes also can’t teleport). This means the IID model is not a good model for our purposes despite it’s use in older kernel density estimation metrics. The brownian motion (BM) process assumes random movement and diffusion in unlimited space and is good for data not comprehensive enough to demonstrate home range or velocity autocorrelation. Typically, BM processes are not suitable for our purposes. However, the Ohrnstein-Uhlenbeck (OU) process uses BM but assumes fidelity to a general location and is suitable for data which shows signs of residency. What this model lacks is the ability to incorporate autocorrelated velocities. The integrated OU (IOU) process solves this problem for data with high resolution but is unable to demonstrate range residency. All of these shortcomings are solved in the Ohrnstein-Uhlenbeck Foraging (OUF) process which combines OU and IOU processes to enable analysis of data demonstrating both velocity autocorrelation and range residency.
This is especially relevant for improved data sampling methods which are able to combine long sampling periods and high resolution of sampling points. Given the extensive dataset used here, we can expect that our best fit movement model might be the OUF model. We also have to differentiate between the accuracy of isotropic versus anisotropic models for each movement model. By running ctmm.fit and ctmm.select, we can produce data to determine the best fit model. Once we have decided between isotropic and anisotropic, we can proceed to visualize those model types to confirm the results of the initial model guess.
variogram.fit & ctmm.guessLet’s input the initial parameter estimates calculated from variogram.fit (GUESS) and ctmm.guess (coy1_GUESS) to compare the two methods.
vg.fitted.mods <- ctmm.select(coy1,CTMM = GUESS,verbose = TRUE) #can take 5+ mins to run
guess.fitted.mods <- ctmm.select(coy1,CTMM = coy1_GUESS,verbose = TRUE) #can take 5+ mins to run
summary(vg.fitted.mods)
summary(guess.fitted.mods)
Both methods yield the same model fittings. Thus - either are acceptable for initial parameter guessing.
We can see from the summaries above that the anisotropic version of OUF is the ideal model for our data. We can also see that the anisotropic versions of each model were favored over their isotropic counterpart (we are looking for lowest AIC values). We can now visually examine the fits of the anisotropic versions of OU and OUF models.
To do this, we must first extract out of the fitted anisotropic versions of the OU and OUF models
NOTE: as shown above,
vg.fitted.modscould also be used to pull out the model types
OU<-guess.fitted.mods[[4]]
OUF<-guess.fitted.mods[[1]]
ctmm.fit accepts a ctmm object as well as the guesses generated by variogram.fit/ctmm.guess (in our case, OU and OUF) and generates values for comparing the fit of each model. These values include: point estimates, confidence intervals, and AICc. Based on this output, the user can select the most appropriate model for their data. We explain some of these parameters as we go along.
M.OU <- ctmm.fit(coy1,OU) #these fits may take ~2 mins
M.OUF <- ctmm.fit(coy1,OUF)
FITS <- list(OU=M.OU,OUF=M.OUF)
summary(FITS)
AICc is the (linearly) corrected Akaike information criteria. AIC balances likelihood against model complexity in a way that is good if we want to make optimal predictions. A lower AIC is better, thus the OUF model seems better here.
The fit parameter DOF[mean] is the number of degrees of freedom worth of data we have to estimate the stationary mean parameter, assuming that the model is correct.
par(mfrow = c(2, 2))
plot(vg.coy1, CTMM=OU, col.CTMM = '#1b9e77')
title("OU")
plot(vg.coy1, CTMM=OU, col.CTMM = '#1b9e77', fraction = 0.005)
title("OU")
plot(vg.coy1, CTMM=OUF, col.CTMM = '#1b1e77')
title("OUF")
plot(vg.coy1, CTMM=OUF, col.CTMM = '#1b1e77', fraction = 0.005)
title("OUF")
We can see from the figures above that OUF is a better fitted model (though not perfect!). The data we are using here are a subset of a much larger body of data. We reduced the number of observations so that the ctmm models would run in a reasonable amount of time, but it is likely that the OUF model would look even better fitted if more observation points were included. Additionally, even though the OUF model is not perfect in the shorter time lag plot, the overall trend over multiple days (longer time lag) is adequate.
par(mfrow = c(1, 2))
plot(vg.coy1,CTMM=FITS,col.CTMM=c("red","purple"),fraction=0.65,level=0.5)
title("zoomed out")
plot(vg.coy1,CTMM=FITS,col.CTMM=c("red","purple"),fraction=0.05,level=0.5)
title("zoomed in")
We can see the OU (pink) and OUF (purple) models plotted here against the variogram we created earlier. The zoomed in figure shows the purple line, the OUF model, follows the variogram more closely compared to the OU model.
The previous steps revealed the OUF model is best suited for our data. We can now use ctmm.fit to fit our model via maximum likelihood. This function processes a telemetry object and model specification (i.e. OUF, OU, IID, etc.), returning the maximum likelihood parameter and interval estimates. The objects returned by ctmm.fit will allow us to compare the fitted model to the data.
this can take ~2 mins to run
coy1_FIT <- ctmm.fit(coy1, CTMM = OUF, method = "pHREML", control = list(method="pNewton",cores=2), trace = TRUE)
summary(coy1_FIT)
NOTE:
CTMM = OUFbased on our previous figures/calculations which show that is the best fit model compared to the OU model. Additionally, the argumenttrace = TRUEprovides a progress report in the console window asctmm.fitruns.
CHALLENGE 2?
variogram.fit(vg.coy2)
coy2_GUESS <- ctmm.guess(coy2,variogram = vg.coy2,interactive=FALSE)
coy2_SELECT <- ctmm.select(coy2,CTMM = coy2_GUESS,verbose = TRUE)
summary(coy2_SELECT)
OU2<-coy2_SELECT[[4]]
OUF2<-coy2_SELECT[[1]]
M.OU2 <- ctmm.fit(coy2,OU2)
M.OUF2 <- ctmm.fit(coy2,OUF2)
FITS2 <- list(OU=M.OU2,OUF=M.OUF2)
summary(FITS2)
par(mfrow = c(2, 2))
plot(vg.coy2, CTMM=M.OU2, col.CTMM = '#1b9e77')
title("OU")
plot(vg.coy2, CTMM=M.OU2, col.CTMM = '#1b9e77', fraction = 0.005)
title("OU")
plot(vg.coy2, CTMM=M.OUF2, col.CTMM = '#1b1e77')
title("OUF")
plot(vg.coy2, CTMM=M.OUF2, col.CTMM = '#1b1e77', fraction = 0.005)
title("OUF")
coy2_FIT <- ctmm.fit(coy2, CTMM = M.OUF2, method = "pHREML", control = list(method="pNewton",cores=2), trace = TRUE)
summary(coy2_FIT)
both.coyotes <- coyotes[c("coyG","coyH")]
#variograms
variograms <- list(vg.coy1, vg.coy2)
plot.vg <- lapply(1:2, function(i) plot(variograms[[i]],fraction = 0.5))
GUESS_2 <- list(guess.fitted.mods, coy2_GUESS)
FITS_2 <- list(coy1_FIT, coy2_FIT)
par(mfrow = c(2, 2))
plot(variograms[[1]],CTMM=FITS_2,col.CTMM=c("purple"),fraction=0.65,level=0.5)
plot(variograms[[1]],CTMM=FITS_2,col.CTMM=c("purple"),fraction=0.05,level=0.5)
plot(variograms[[2]],CTMM=FITS_2,col.CTMM=c("blue"),fraction=0.65,level=0.5)
plot(variograms[[2]],CTMM=FITS_2,col.CTMM=c("blue"),fraction=0.05,level=0.5)
names(FITS_2) <- names(both.coyotes[1:2])
AKDE corrects for biases due to autocorrelation, small effective sample size, and irregular sampling in time
UDS_coyotes <- akde(both.coyotes[1:2],FITS_2)
plot(UDS_coyotes[[1]])
summary(UDS_coyotes[[1]])
plot(UDS_coyotes[[2]])
summary(UDS_coyotes[[2]])
This function calculates a useful measure of similarity between distributions evaluate overlap of 95% Home Range.The choice method=“BA” computes the Bhattacharyya’s affinity (BA). Values range from zero (no overlap) to 1 (identical Utilization Distributions).
95% level of confidence
overlap(UDS_coyotes)
plotting overlap
#use the plot function to look at the AKDE UD
#95% Home Range (95% is the default)
#The middle contour represent the maximum likelihood area where the animal spends 95% of its time.
plot(both.coyotes, UD = UDS_coyotes, col=rainbow(length(both.coyotes)))
Coyote P4_2 (in red) and coyote P5_2 (in blue) are tracked coyotes in different packs (4 and 5). We can see how these two packs separate themselves by the little overlap and clear separation of their space usage.
We can also estimate the CDE (conditional distribution of encounters) (Noonan et al., 2020). CDE is a concept describing the long-term encounter location probabilities for movement within home ranges.
CDE_coyotes <- encounter(UDS_coyotes)
plot(both.coyotes,
col=c("#FF0000", "#F2AD00"),
UD=CDE_coyotes,
col.DF="#046C9A",
col.grid = NA)
Literature Cited
C, J. M., Fleming, C. H., & Gurarie, E. (2016). ctmm: an r package for analyzing animal relocation data as a continuous-time stochastic process. Methods in Ecology and Evolution, 7(9), 1124–1132.
Codling, E. A., Plank, M. J., & Benhamou, S. (2008). Random walk models in biology. Journal of The Royal Society Interface, 5(25), 813–834. https://doi.org/10.1098/rsif.2008.0014
Fleming, C. H., Fagan, W. F., Mueller, T., Olson, K. A., Leimgruber, P., & Calabrese, J. M. (2015). Rigorous home range estimation with movement data: a new autocorrelated kernel density estimator. Ecology, 96(5), 1182-1188.
Fleming, C. H., & Calabrese, J. M. (2017). A new kernel density estimator for accurate home‐range and species‐range area estimation. Methods in Ecology and Evolution, 8(5), 571-579.
Kranstauber, B., Cameron, A., Weinzerl, R., Fountain, T., Tilak, S., Wikelski, M., & Kays, R. (2011). The Movebank data model for animal tracking. Environmental Modelling & Software, 26(6), 834-835.
Mrozewski, T. (2018). Movebank. Bulletin-Association of Canadian Map Libraries and Archives (ACMLA), (158), 24-27.
Nau, R. (2014, November 4). Notes on the random walk model - people.duke.edu. Notes on the random walk model. Retrieved 2021, from https://people.duke.edu/~rnau/Notes_on_the_random_walk_model--Robert_Nau.pdf